Knowledge-Graph- and GCN-Based Domain Chinese Long Text Classification Method

نویسندگان

چکیده

In order to solve the current problems in domain long text classification tasks, namely, length of a document, which makes it difficult for model capture key information, and lack expert knowledge, leads insufficient accuracy, based on knowledge graph convolutional neural network is proposed. BERT used encode text, each word’s corresponding vector as node so that initialized contains rich semantic information. Using trained entity–relationship extraction model, entity-to-entity–relationships document are extracted edges network, together with syntactic dependency The structure mask learn about edge relationships types further enhance learning ability dependencies between words. method improves accuracy by fusing features data features. Experiments three datasets—IFLYTEK, THUCNews, Chinese corpus Fudan University—show improvements 8.8%, 3.6%, 2.6%, respectively, relative model.

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Chinese Short Text Classification Based on Domain Knowledge

People are generating more and more short texts. There is an urgent demand to classify short texts into different domains. Due to the shortness and sparseness of short texts, conventional methods based on Vector Space Model (VSM) have limitations. To tackle the data scarcity problem, we propose a new model to directly measure the correlation between a short text instance and a domain instead of...

متن کامل

Some Studies on Chinese Domain Knowledge Dictionary and Its Application to Text Classification

In this paper, we study some issues on Chinese domain knowledge dictionary and its application to text classification task. First a domain knowledge hierarchy description framework and our Chinese domain knowledge dictionary named NEUKD are introduced. Second, to alleviate the cost of construction of domain knowledge dictionary by hand, we use a boostrapping-based algorithm to learn new domain ...

متن کامل

Improved Graph Based K-NN Text Classification

This paper presents an improved graph based k-nn algorithm for text classification. Most of the organization are facing problem of large amount of unorganized data. Most of the existing text classification techniques are based on vector space model which ignores the structural information of the document which is the word order or the co-occurrences of the terms or words. In this paper we have ...

متن کامل

Domain Based Classification of Punjabi Text Documents

With the dramatic increase in the amount of content available in digital forms gives rise to a problem to manage this online textual data. As a result, it has become a necessary to classify large texts (documents) into specific classes. And Text Classification is a text mining technique which is used to classify the text documents into predefined classes. Most text classification techniques wor...

متن کامل

A Knowledge-based Approach to Text Classification

The paper presents a simple and effective knowledge-based approach for the task of text classification. The approach uses topic identification algorithm named FIFA to text classification. In this paper the basic process of text classification task and FIFA algorithm are described in detail. At last some results of experiment and evaluations are discussed.

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Applied sciences

سال: 2023

ISSN: ['2076-3417']

DOI: https://doi.org/10.3390/app13137915